contrast, methods that first find out whether the protein sequence is similar enough to a
known structure and then predict the 3-D structure after “copying” it are surprisingly powerful
due to the sheer size of the data (tens of thousands of known protein structures with
their x-y-z coordinates in the protein database).
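The template-based idea described here (first check whether the sequence is similar enough to a known structure, then "copy" that structure) can be illustrated with a minimal sketch. It assumes a small in-memory dictionary of template sequences and a sequence-identity cutoff of 30%, a commonly quoted rule of thumb that is not taken from this text. The function names `global_identity` and `pick_template`, the scoring values, and the toy sequences are hypothetical; a real pipeline would use a dedicated alignment tool and a structure database.

```python
def global_identity(a: str, b: str, match: int = 1, mismatch: int = -1, gap: int = -2) -> float:
    """Fraction of identical columns in a simple Needleman-Wunsch global alignment."""
    n, m = len(a), len(b)
    # score[i][j] = best score for aligning a[:i] with b[:j]
    score = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        score[i][0] = i * gap
    for j in range(1, m + 1):
        score[0][j] = j * gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            diag = score[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            score[i][j] = max(diag, score[i - 1][j] + gap, score[i][j - 1] + gap)

    # Trace back through the matrix, counting aligned columns and identical pairs.
    i, j, identical, columns = n, m, 0, 0
    while i > 0 or j > 0:
        columns += 1
        if i > 0 and j > 0:
            step = match if a[i - 1] == b[j - 1] else mismatch
            if score[i][j] == score[i - 1][j - 1] + step:
                identical += a[i - 1] == b[j - 1]
                i, j = i - 1, j - 1
                continue
        if i > 0 and score[i][j] == score[i - 1][j] + gap:
            i -= 1  # gap in b
        else:
            j -= 1  # gap in a
    return identical / columns if columns else 0.0


def pick_template(query: str, templates: dict, min_identity: float = 0.30):
    """Return (name, identity) of the most similar template, or None if none is similar enough.

    min_identity is an assumed cutoff, not a value given in the text.
    """
    best = None
    for name, seq in templates.items():
        identity = global_identity(query, seq)
        if best is None or identity > best[1]:
            best = (name, identity)
    if best is None or best[1] < min_identity:
        return None
    return best


# Toy sequences, purely illustrative.
templates = {"tmplA": "MKTAYIAKQR", "tmplB": "GGGGGGGGGG"}
print(pick_template("MKTAYIAKQK", templates))
```

If a template passes the cutoff, its coordinates would serve as the starting model for the query; if none does, the sequence falls into the much harder case of a new fold.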
Then, once the protein has been completed to that point, its stability is determined by the
amino acids encoded by the codons at the 3′-terminus of the coding sequence. For example,
specific instability sequences at the C-terminus of the protein mark it for faster degradation.
In general, bioinformatics for deciphering these codes practically always starts with the
sequence but then draws on other features, especially the structure and, for RNA for example,
the energy. For proteins, structure prediction is still computationally time-intensive and
very difficult for new folding types. The decoding of transcription, of DNA control
sequences, and even of new types of RNA is likewise only partially understood; for lncRNAs
(long non-coding RNAs) and miRNAs (microRNAs), for example, one must correctly predict their
targets. On the other hand, increasingly complete large datasets of total transcription from
a wide variety of cell types are available, gradually supplemented by proteome datasets and
metabolite data.
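To make the target-prediction problem for miRNAs more concrete, here is a minimal sketch of the simplest sequence-based approach: searching a transcript for sites complementary to the miRNA "seed". The seed definition used (nucleotides 2 to 8 from the 5′ end) is a common convention and an assumption here, not something stated in this text. The function names and input sequences are illustrative, and real predictors add further criteria such as conservation and binding energy.

```python
# Complement table for RNA bases.
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str, start: int = 1, end: int = 8) -> str:
    """Return the target-site motif: the reverse complement of the miRNA seed.

    With the defaults, the seed is nucleotides 2-8 (1-based) of the miRNA,
    an assumed convention for this sketch.
    """
    seed = mirna.upper().replace("T", "U")[start:end]
    return seed.translate(COMPLEMENT)[::-1]


def find_seed_matches(mirna: str, utr: str) -> list:
    """Return 0-based start positions of all seed-complementary sites in the transcript."""
    site = seed_site(mirna)
    utr = utr.upper().replace("T", "U")
    hits = []
    pos = utr.find(site)
    while pos != -1:
        hits.append(pos)
        pos = utr.find(site, pos + 1)
    return hits


# Illustrative inputs only: a 22-nt miRNA and a short made-up transcript fragment.
mirna = "UGAGGUAGUAGGUUGUAUAGUU"
utr = "AAACUACCUCGGGCUACCUCAA"
print(seed_site(mirna))
print(find_seed_matches(mirna, utr))
```

Exact seed matching finds many candidate sites that are not functional, which is one reason why target prediction remains only partially solved.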
13.2 New Molecular, Cellular and Intercellular Levels and Types of Language Are Emerging All the Time
The exciting thing, however, is that these types of languages are only the beginning. For
example, at the molecular level there is also a sugar code (glycosylations and the proteins
that bind these sugar residues, so-called lectins), which regulates, among other things,
which cells come together to form tissue associations and which is simply ignored by cancer
cells when metastases form. There are also other codes for cell-cell communication (lipids,
desmosomes and so on), until we finally arrive at one of the most complex systems of all,
the immune system, which in each of us performs the task of reliably distinguishing between
self and foreign. There is already a great deal of data on the immune system, for example
on the white blood cells, where we can distinguish between lymphocytes (antibody-producing
B cells and directly defending T cells; the latter are subdivided into helper cells, natural
killer cells and CD8 T cells and then into ever new subtypes), and on other defence cells,
in particular monocytes, dendritic cells and macrophages. But that is only the beginning.
Immunologists distinguish very fine subtypes depending on the surface receptors that white
blood cells carry and on their specific subfunction. In addition, there are platelets, which
also support the immune response. We study these cell types intensively and find that a
separate systems biology model can be built for each of these defence cells. The language
diversity and complex coding of the various immune responses are surpassed in complexity
only by our nervous system. The codes and language levels of both systems have so far been
deciphered only in rough outline. So there are still many open questions and exciting
secrets waiting to be deciphered.
In evolutionary terms, the different levels of the languages of life can be simplified as
shown in the box: Starting from preforms of life (about 3.3 billion years ago), as is still the